[testnet] Use streaming DownloadBlobs RPC for batch blob downloads #6476

Draft
ndr-ds wants to merge 2 commits into testnet_conway from ndr-ds/backport-streaming-download-blobs-caller

ndr-ds commented Jun 9, 2026

Motivation

The streaming DownloadBlobs validator endpoint was backported to
testnet_conway in #6156, but nothing on the client uses it yet — chain
sync still downloads blobs one unary download_blob RPC at a time. This PR
backports the caller so testnet_conway clients (notably the PM workers)
benefit from the streaming endpoint.

Proposal

Client-side only. Cherry-pick of the main change (#6157):

  • download_blobs issues one streaming DownloadBlobs request per validator
    (tried in shuffled order) carrying all still-missing blob IDs, instead of
    one unary RPC per blob. Falls back to the next validator only for IDs not
    yet returned, preserving partial progress.
  • The per-attempt deadline is progress-based: blob_download_timeout bounds
    the wait for the stream to open and for each subsequent item, so a stalled
    peer is abandoned quickly while a transfer that is still making progress is
    never cancelled. (A whole-attempt cutoff would make batches larger than ~1s
    of transfer time permanently undownloadable with the default setting.) Flag
    docs and CLI.md updated to match.
  • Deduplicate requested blob IDs up front (BTreeSet) so a repeated ID can't
    abort the stream and fail the batch — process_certificates does pass
    duplicates, since it flat-maps required_blob_ids across certificates.
  • An attempt that ends because the peer sent an unexpected or duplicate blob
    counts as a failed interaction in that peer's EMA metrics.
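
The retry loop described in the bullets above can be sketched roughly as follows. This is an illustrative model only, not the linera-core code: `BlobId`, `Blob`, `AttemptError`, `drain_stream`, the `open_stream` callback and the free-standing `download_blobs` signature are all hypothetical stand-ins, a `std::sync::mpsc` channel plays the role of the gRPC stream, and peers are assumed to be already shuffled by the caller.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

// Hypothetical stand-ins for the real types; not the linera-core API.
pub type BlobId = u64;
pub type Blob = Vec<u8>;

/// Why an attempt against one peer ended early.
#[derive(Debug, PartialEq)]
pub enum AttemptError {
    /// No item arrived within the per-item deadline.
    Stalled,
    /// The peer sent a blob that was not requested or was already received.
    UnexpectedBlob(BlobId),
}

/// Drains one peer's stream, moving received blobs from `missing` to `found`.
/// `item_timeout` bounds the wait for each item, not the whole attempt, so a
/// stream that keeps delivering is never cut off.
pub fn drain_stream(
    stream: &Receiver<(BlobId, Blob)>,
    missing: &mut BTreeSet<BlobId>,
    found: &mut BTreeMap<BlobId, Blob>,
    item_timeout: Duration,
) -> Result<(), AttemptError> {
    while !missing.is_empty() {
        match stream.recv_timeout(item_timeout) {
            Ok((id, blob)) => {
                // `remove` returns false for unrequested or repeated IDs.
                if !missing.remove(&id) {
                    return Err(AttemptError::UnexpectedBlob(id));
                }
                found.insert(id, blob);
            }
            Err(RecvTimeoutError::Timeout) => return Err(AttemptError::Stalled),
            // The peer closed the stream; leftovers go to the next peer.
            Err(RecvTimeoutError::Disconnected) => return Ok(()),
        }
    }
    Ok(())
}

/// Tries peers `0..peer_count` in order, asking each only for the IDs that
/// are still missing. Returns the blobs found and the peers whose attempt
/// failed (the ones a caller would penalize in its metrics).
pub fn download_blobs<F>(
    requested: &[BlobId],
    peer_count: usize,
    mut open_stream: F,
    item_timeout: Duration,
) -> (BTreeMap<BlobId, Blob>, Vec<usize>)
where
    F: FnMut(usize, &BTreeSet<BlobId>) -> Receiver<(BlobId, Blob)>,
{
    // Collecting into a BTreeSet deduplicates the request up front.
    let mut missing: BTreeSet<BlobId> = requested.iter().copied().collect();
    let mut found = BTreeMap::new();
    let mut failed_peers = Vec::new();
    for peer in 0..peer_count {
        if missing.is_empty() {
            break;
        }
        let stream = open_stream(peer, &missing);
        if drain_stream(&stream, &mut missing, &mut found, item_timeout).is_err() {
            failed_peers.push(peer);
        }
    }
    (found, failed_peers)
}
```

In this model, blobs received before a stall or a protocol violation stay in `found`, so the next peer is asked only for the remainder. The sketch leaves out the deadline on opening the stream, the shuffling, and the EMA bookkeeping itself.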

Unlike #6157, this backport does not remove the now-unused per-blob
client path (RemoteNode::download_blob, RequestKey::Blob,
RequestResult::Blob) — keeping the backport surface minimal.

What this optimizes (and what it does not). This reduces client→validator
RPC fan-out from N unary calls to one streamed call per validator. It does
not reduce validator-side storage work: the handler still performs N
parallel single-partition point reads, because each blob lives under its own
root_key partition and can't be coalesced into a single IN query. That
storage cost is also largely irrelevant in practice: the proxy serves ~99.8%
of these blobs from its in-memory cache (testnet-conway production,
linera_read_blob cache vs db over 24h). The win therefore scales with the
number of blobs per batch: meaningful for blob-heavy cold syncs (PM
market/worker chains), a no-op for blob-light root chains.

No protocol/storage-format change; the server side is unchanged and already
deployed on testnet_conway.

Test Plan

  • No regression on blob-light chains (measured). Fresh-wallet/fresh-DB
    sync of root chain d45db728… (8000 blocks), old vs new binary, both
    ~16 blocks/s. That sync pulled exactly one blob (ChainDescription at
    genesis), so the batching has nothing to collapse — as expected for root
    chains.
  • Expected win on PM chains (not yet measured end-to-end). Production
    linera_write_blob shows PM workers writing 50k–230k blobs per ~2-day
    cold-start window; each was a separate unary RPC before, and now streams.
    A head-to-head blocks/s number requires PM wallet keys or a local net with
    a published app + cross-chain blob traffic; tracked as follow-up.
  • cargo test -p linera-core --lib requests_scheduler (27 tests) and
    cargo clippy clean; CLI.md regenerated.

Release Plan

Links

ndr-ds force-pushed the ndr-ds/backport-streaming-download-blobs-caller branch from 0ecf301 to 8c17e5c June 10, 2026 14:44
ndr-ds marked this pull request as ready for review June 10, 2026 15:32
ndr-ds marked this pull request as draft June 11, 2026 17:32
ndr-ds added a commit that referenced this pull request Jun 12, 2026
…d toolchain (#6489)

## Motivation

The `lint-check-for-outdated-readme` job is now failing
deterministically on every `testnet_conway` PR. `cargo install
cargo-rdme --locked` resolves to the freshly released cargo-rdme 1.5.1,
whose lockfile pulls `zmij 1.0.21`, which uses
`hint::select_unpredictable` — stabilized in Rust 1.88. This branch's
pinned toolchains (stable 1.86.0, nightly-2025-04-03) cannot build it,
so the install step exits 101 before the README check even runs. The
same job passed on this branch as recently as 2026-06-11 (PRs #6474,
#6476) and broke on 2026-06-12 (#6488), bracketing the upstream release.
`main` is unaffected (toolchain 1.95).

## Proposal

Pin the tool: `cargo install cargo-rdme --locked --version 1.5.0`, with
a comment explaining why. cargo-rdme 1.5.0 declares `rust-version =
1.86.0`.

## Test Plan

Verified locally that `cargo install cargo-rdme --locked --version
1.5.0` builds successfully on rustc 1.86.0 (this branch's stable pin)
and the binary runs. This PR's own `lint-check-for-outdated-readme` job
exercises the pinned install directly, so its CI result is the
end-to-end proof.

## Release Plan

- CI-only change on `testnet_conway`; nothing to deploy. `main` does not
need this (newer toolchain), though the pin could be ported there for
reproducibility if desired.